
deepspeed-chat: train v_head when only optimizing lora #758

Merged
tjruwase merged 2 commits into deepspeedai:master from mosheisland:8_train_v_head_lora on Oct 16, 2023
Conversation

@mosheisland
Copy link
Contributor

When training with only_optimize_lora enabled, we still need to train the v_head parameter.

Change-Id: I252c3ee69819997bf336482c6779b070f2e76df8
Signed-off-by: Moshe Island <misland@habana.ai>
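A minimal sketch of the idea behind this fix (not the exact DeepSpeed-Chat code): when only the LoRA weights are being optimized, every other parameter is frozen, but the reward model's `v_head` must be excluded from that freeze or it never learns. The function and parameter names below (`only_optimize_lora_parameters`, `force_optimize_params`) and the toy model are illustrative assumptions, loosely mirroring DeepSpeed-Chat naming conventions.

```python
import torch.nn as nn

def only_optimize_lora_parameters(model: nn.Module,
                                  force_optimize_params=("v_head",)):
    """Freeze all parameters except LoRA weights and the listed extras.

    Illustrative sketch: parameters whose name contains "lora_" or any
    entry of force_optimize_params stay trainable; the rest are frozen.
    """
    for name, param in model.named_parameters():
        param.requires_grad = (
            "lora_" in name
            or any(p in name for p in force_optimize_params)
        )
    return model

class ToyRewardModel(nn.Module):
    """Hypothetical stand-in for a LoRA-augmented reward model."""
    def __init__(self):
        super().__init__()
        self.base = nn.Linear(4, 4)                     # frozen backbone
        self.lora_right = nn.Linear(4, 2, bias=False)   # LoRA adapter weight
        self.v_head = nn.Linear(4, 1, bias=False)       # value head (the fix)

model = only_optimize_lora_parameters(ToyRewardModel())
```

With this, `base.weight.requires_grad` is False while both the LoRA weight and `v_head` remain trainable, which is the behavior this PR restores.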
@tjruwase tjruwase merged commit 5161c0f into deepspeedai:master Oct 16, 2023
@mosheisland mosheisland deleted the 8_train_v_head_lora branch October 17, 2023 06:40
hwchen2017 pushed a commit that referenced this pull request Jun 8, 2025
Co-authored-by: Moshe Island <misland@habana.ai>
Co-authored-by: Lev Kurilenko <113481193+lekurile@users.noreply.github.com>


3 participants